This chapter covers data analysis of public bike sharing systems around the world. It’s the first of a few chapters covering different transport means, such as airplanes, busese, cars and others. The idea of this project is to apply data analysis techniques to real world data sets. In this first chapter I will show a full data analysis cycle, starting in gathering data and ending by creating a prediction model of public bike sharing system capacity in London city.
This document is built using markdown syntax and then generated HTML code for publishing the whole on the Internet. The flow is basically splitted in three parts, gathering, storing and modeling data.
The following code is coming from get_citybik.R In order to gather data, I will be using CityBikes API www.citybik.es where you can access in real time to data of dozens of public bike sharing systems around the world.
First we need to set our current working directory. Beware of changing this working directory if you happen to be working with these data by your own.
Get a list of all networks using the network endpoint. Basically I want to know which bike systems are available using this endpoint.
# Get all networks involved
network_endpoint = "http://api.citybik.es/v2/networks"
network_endpoint_df <- jsonlite::fromJSON(network_endpoint, simplifyDataFrame = TRUE)
Get valid data from the list of all networks and incorporate a column with the suffix URL.
# Refine endpoint output
company_names = network_endpoint_df$networks
company_ids = company_names$id
company_data = company_names$location
company_url = company_names$href
# Build a suffix to endpoint
suffix = "http://api.citybik.es"
company_url = paste(suffix,company_names$href,sep="")
company_df = cbind(company_ids,company_data,company_url)
This is the resulting table:
A subset of company_df.
| company_ids | city | country | latitude | longitude | company_url |
|---|---|---|---|---|---|
| opole-bike | Opole | PL | 50.66450 | 17.92760 | http://api.citybik.es/v2/networks/opole-bike |
| wroclawski-rower-miejski | Wrocław | PL | 51.10970 | 17.04850 | http://api.citybik.es/v2/networks/wroclawski-rower-miejski |
| veturilo | Warszawa | PL | 52.23450 | 21.00240 | http://api.citybik.es/v2/networks/veturilo |
| bubi | Budapest | HU | 47.49791 | 19.04023 | http://api.citybik.es/v2/networks/bubi |
| kvb-rad | Köln | DE | 50.93640 | 6.96053 | http://api.citybik.es/v2/networks/kvb-rad |
| bemowo-bike | Warszawa | PL | 52.26060 | 20.92900 | http://api.citybik.es/v2/networks/bemowo-bike |
Export a CSV file with the previous dataframe just in case we need this information at any time.
# Export list of companies
write.csv(company_df,file = "list_of_bikes_system_worldwide.csv",row.names = FALSE)
Now I am going to create a function to get data from all our systems.
A subset of company_df.
| company_ids | city | country | latitude | longitude | company_url |
|---|---|---|---|---|---|
| opole-bike | Opole | PL | 50.66450 | 17.92760 | http://api.citybik.es/v2/networks/opole-bike |
| wroclawski-rower-miejski | Wrocław | PL | 51.10970 | 17.04850 | http://api.citybik.es/v2/networks/wroclawski-rower-miejski |
| veturilo | Warszawa | PL | 52.23450 | 21.00240 | http://api.citybik.es/v2/networks/veturilo |
| bubi | Budapest | HU | 47.49791 | 19.04023 | http://api.citybik.es/v2/networks/bubi |
| kvb-rad | Köln | DE | 50.93640 | 6.96053 | http://api.citybik.es/v2/networks/kvb-rad |
| bemowo-bike | Warszawa | PL | 52.26060 | 20.92900 | http://api.citybik.es/v2/networks/bemowo-bike |
I will start with one system and then apply the same function to a set of cities, although you could do it for the >400 cities included in this magnificent API.
# 2.- For each system, get data (build DB and cron job)
# Build a function to read, parse and store endpoint data
bike_input = "http://api.citybik.es/v2/networks/wroclawski-rower-miejski"
# Read one system
bike_input_df = jsonlite::fromJSON(bike_input, simplifyDataFrame = TRUE)
bike_input_df_2 = bike_input_df$network
bike_input_df_3 = bike_input_df_2$stations
bike_df = bike_input_df_3[c(1,3:8)]
total_slots = bike_input_df_3$extra$slots
bike_df$total_slots = total_slots
The resulting table of capacity data in a given city:
A subset of bike_df.
| empty_slots | free_bikes | id | latitude | longitude | name | timestamp | total_slots |
|---|---|---|---|---|---|---|---|
| 15 | 0 | 52515b944e4a64584def861922aa6db1 | 51.12528 | 16.98439 | Legnicka - Wejherowska | 2016-12-28T16:53:54.843000Z | 15 |
| 15 | 0 | dfe66f3e0cb8ac29b2386df9d68341d7 | 51.13208 | 17.06550 | Pl. Kromera | 2016-12-28T16:53:54.800000Z | 15 |
| 13 | 0 | 0138e16d65d5db5b6e15970b73686b1a | 51.10723 | 17.06133 | Politechnika Wrocławska - Gmach Główny | 2016-12-28T16:53:54.822000Z | 14 |
| 15 | 0 | 7daa95e8b08a9d4b1401d5d7ed71e5c7 | 51.10098 | 17.00830 | Pereca - Grabiszyńska | 2016-12-28T16:53:54.832000Z | 15 |
| 15 | 0 | ae5d14252719f50d17084268b9906747 | 51.09804 | 17.00680 | Żelazna - Pereca | 2016-12-28T16:53:54.843000Z | 15 |
| 15 | 0 | d56c50f2935797bcffcba9f4237ded56 | 51.11943 | 17.05151 | Wyszyńskiego - Prusa | 2016-12-28T16:53:54.852000Z | 15 |
You can see some of the cities included in the API.
# Get id + url
input_array = company_df[c(1,6)]
A subset of input_array.
| company_ids | company_url |
|---|---|
| opole-bike | http://api.citybik.es/v2/networks/opole-bike |
| wroclawski-rower-miejski | http://api.citybik.es/v2/networks/wroclawski-rower-miejski |
| veturilo | http://api.citybik.es/v2/networks/veturilo |
| bubi | http://api.citybik.es/v2/networks/bubi |
| kvb-rad | http://api.citybik.es/v2/networks/kvb-rad |
| bemowo-bike | http://api.citybik.es/v2/networks/bemowo-bike |
I will build two arrays with key information, the city id and its API url, which are needed to build the final query.
A subset of array_id.
| array_id |
|---|
| opole-bike |
| wroclawski-rower-miejski |
| veturilo |
| bubi |
| kvb-rad |
| bemowo-bike |
A subset of array_url.
As previously shown, let’s create a function to gather data from one city, and then apply that same function to more than one city.
# Create the function for one system
foo = function(url,id){
input = as.character(url)
read_input = jsonlite::fromJSON(input, simplifyDataFrame = TRUE)
input_df_2 = read_input$network
input_df_3 = input_df_2$stations
input_df = input_df_3[c(1,3:8)]
input_df$company_id = as.character(id)
return (input_df)
}
Now it’s time to test the function in one city and then to a small amouny of cities and ultimately, if you wish, to the whole array of cities.
# Test fuction for one:
aa = array_url[2, ]
aa
## [1] http://api.citybik.es/v2/networks/wroclawski-rower-miejski
## 457 Levels: http://api.citybik.es/v2/networks/adcb-bikeshare ...
a = foo(aa,"wroclawski-rower-miejski")
A subset of one system output.
| empty_slots | free_bikes | id | latitude | longitude | name | timestamp | company_id |
|---|---|---|---|---|---|---|---|
| 15 | 0 | 52515b944e4a64584def861922aa6db1 | 51.12528 | 16.98439 | Legnicka - Wejherowska | 2016-12-28T16:56:58.738000Z | wroclawski-rower-miejski |
| 15 | 0 | dfe66f3e0cb8ac29b2386df9d68341d7 | 51.13208 | 17.06550 | Pl. Kromera | 2016-12-28T16:56:58.703000Z | wroclawski-rower-miejski |
| 13 | 0 | 0138e16d65d5db5b6e15970b73686b1a | 51.10723 | 17.06133 | Politechnika Wrocławska - Gmach Główny | 2016-12-28T16:56:58.726000Z | wroclawski-rower-miejski |
| 15 | 0 | 7daa95e8b08a9d4b1401d5d7ed71e5c7 | 51.10098 | 17.00830 | Pereca - Grabiszyńska | 2016-12-28T16:56:58.732000Z | wroclawski-rower-miejski |
| 15 | 0 | ae5d14252719f50d17084268b9906747 | 51.09804 | 17.00680 | Żelazna - Pereca | 2016-12-28T16:56:58.736000Z | wroclawski-rower-miejski |
| 15 | 0 | d56c50f2935797bcffcba9f4237ded56 | 51.11943 | 17.05151 | Wyszyńskiego - Prusa | 2016-12-28T16:56:58.751000Z | wroclawski-rower-miejski |
The next rows are meant to apply the foo (get data from one system) function to the 452 systems available in Citybikes.
# Apply function to all the bike systems
# u = array_url[1:452,]
# i = array_id[1:452,]
#
# foo_all <- lapply(u,foo,i)
So the above chunk shows the techniques to play with the API. Now I will start to get real time data of eight european cities each hour and store that data in a SQLite Database.
Using the function created before, I test first one city.
# Test one city.
u = "http://api.citybik.es/v2/networks/dublinbikes"
i = "dublin"
city_ind = foo(u,i)
A subset of one city_ind.
| empty_slots | free_bikes | id | latitude | longitude | name | timestamp | company_id |
|---|---|---|---|---|---|---|---|
| 30 | 5 | b1c774f178240d5d757854364dafc840 | 53.34143 | -6.246720 | FENIAN STREET | 2016-12-28T16:56:45.830000Z | dublin |
| 29 | 1 | 135d1f3410c5dbcc49c55a1d522dbbe6 | 53.34664 | -6.246154 | CITY QUAY | 2016-12-28T16:56:45.844000Z | dublin |
| 34 | 6 | 153ff4dfb7bd8912ef91c10849129c2e | 53.33521 | -6.250900 | FITZWILLIAM SQUARE EAST | 2016-12-28T16:56:45.893000Z | dublin |
| 21 | 9 | 516ee41c56889c29b8fa5f4ac0e824f4 | 53.33901 | -6.300217 | BROOKFIELD ROAD | 2016-12-28T16:56:45.815000Z | dublin |
| 27 | 13 | 6bff19daf1747a1e956a899ef8021232 | 53.34071 | -6.308191 | EMMET ROAD | 2016-12-28T16:56:45.896000Z | dublin |
| 20 | 15 | 11b359924d83d802a41856c405d13e61 | 53.33878 | -6.303950 | ROTHE ABBEY | 2016-12-28T16:56:45.891000Z | dublin |
Then I create (just the first time) the SQLite database nd store the results for future analysis.
Set up a cron job to get and store data for the selected list of systems with the following code (see store_citybik.R)
In this following code chunk you will see the case of one city (Dublin) but if you go to store_citybik.R you will see how I got data from eight different cities in the cron job.
## [1] 101 8
A subset of one city_ind.
| empty_slots | free_bikes | id | latitude | longitude | name | timestamp | company_id |
|---|---|---|---|---|---|---|---|
| 30 | 5 | b1c774f178240d5d757854364dafc840 | 53.34143 | -6.246720 | FENIAN STREET | 2016-12-28T16:56:45.830000Z | dublin |
| 29 | 1 | 135d1f3410c5dbcc49c55a1d522dbbe6 | 53.34664 | -6.246154 | CITY QUAY | 2016-12-28T16:56:45.844000Z | dublin |
| 34 | 6 | 153ff4dfb7bd8912ef91c10849129c2e | 53.33521 | -6.250900 | FITZWILLIAM SQUARE EAST | 2016-12-28T16:56:45.893000Z | dublin |
| 21 | 9 | 516ee41c56889c29b8fa5f4ac0e824f4 | 53.33901 | -6.300217 | BROOKFIELD ROAD | 2016-12-28T16:56:45.815000Z | dublin |
| 27 | 13 | 6bff19daf1747a1e956a899ef8021232 | 53.34071 | -6.308191 | EMMET ROAD | 2016-12-28T16:56:45.896000Z | dublin |
| 20 | 15 | 11b359924d83d802a41856c405d13e61 | 53.33878 | -6.303950 | ROTHE ABBEY | 2016-12-28T16:56:45.891000Z | dublin |
Once you got data, you need to insert it into the database.
## [1] TRUE
The final part of this process is coming from model_bikes.R I have been gathering data from those eight cities for a couple of months, therefore I have a rich dataset which in reality is a bunch of datasets that can be used to apply data analysis to any of those european cities.
The first thing to do is to connect to the database where I stored data and get it all.
# Open connection
con <- dbConnect(SQLite(),dbname="bikes.sqlite3")
my_db <- src_sqlite( "bikes.sqlite3", create = FALSE)
# Get all data
all_data = dbGetQuery(con,"SELECT * FROM bikes_records_2")
str(all_data)
## 'data.frame': 1075007 obs. of 8 variables:
## $ empty_slots: int 15 3 8 27 36 34 30 17 40 39 ...
## $ free_bikes : int 20 26 32 0 4 1 0 3 0 1 ...
## $ id : chr "b1c774f178240d5d757854364dafc840" "135d1f3410c5dbcc49c55a1d522dbbe6" "153ff4dfb7bd8912ef91c10849129c2e" "516ee41c56889c29b8fa5f4ac0e824f4" ...
## $ latitude : num 53.3 53.3 53.3 53.3 53.3 ...
## $ longitude : num -6.25 -6.25 -6.25 -6.3 -6.31 ...
## $ name : chr "FENIAN STREET" "CITY QUAY" "FITZWILLIAM SQUARE EAST" "BROOKFIELD ROAD" ...
## $ timestamp : chr "2016-10-17T12:47:15.653000Z" "2016-10-17T12:47:15.673000Z" "2016-10-17T12:47:15.755000Z" "2016-10-17T12:47:15.625000Z" ...
## $ company_id : chr "dublin" "dublin" "dublin" "dublin" ...
A subset of all_data.
| empty_slots | free_bikes | id | latitude | longitude | name | timestamp | company_id |
|---|---|---|---|---|---|---|---|
| 15 | 20 | b1c774f178240d5d757854364dafc840 | 53.34143 | -6.246720 | FENIAN STREET | 2016-10-17T12:47:15.653000Z | dublin |
| 3 | 26 | 135d1f3410c5dbcc49c55a1d522dbbe6 | 53.34664 | -6.246154 | CITY QUAY | 2016-10-17T12:47:15.673000Z | dublin |
| 8 | 32 | 153ff4dfb7bd8912ef91c10849129c2e | 53.33521 | -6.250900 | FITZWILLIAM SQUARE EAST | 2016-10-17T12:47:15.755000Z | dublin |
| 27 | 0 | 516ee41c56889c29b8fa5f4ac0e824f4 | 53.33901 | -6.300217 | BROOKFIELD ROAD | 2016-10-17T12:47:15.625000Z | dublin |
| 36 | 4 | 6bff19daf1747a1e956a899ef8021232 | 53.34071 | -6.308191 | EMMET ROAD | 2016-10-17T12:47:15.766000Z | dublin |
| 34 | 1 | 11b359924d83d802a41856c405d13e61 | 53.33878 | -6.303950 | ROTHE ABBEY | 2016-10-17T12:47:15.744000Z | dublin |
Let’s apply some time transformations.
# Insert dates columns
#all_data$timestamp = as.Date(all_data$timestamp)
all_data$time_2 = ymd_hms(all_data$timestamp)
all_data$week_day = wday(all_data$time_2,label = TRUE)
all_data$hour = hour(all_data$time_2)
all_data$shift = cut(all_data$hour,c(0,7,15,23),c("night","morning","afternoon"))
A subset of all_data.
| empty_slots | free_bikes | id | latitude | longitude | name | timestamp | company_id | time_2 | week_day | hour | shift |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 20 | b1c774f178240d5d757854364dafc840 | 53.34143 | -6.246720 | FENIAN STREET | 2016-10-17T12:47:15.653000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning |
| 3 | 26 | 135d1f3410c5dbcc49c55a1d522dbbe6 | 53.34664 | -6.246154 | CITY QUAY | 2016-10-17T12:47:15.673000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning |
| 8 | 32 | 153ff4dfb7bd8912ef91c10849129c2e | 53.33521 | -6.250900 | FITZWILLIAM SQUARE EAST | 2016-10-17T12:47:15.755000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning |
| 27 | 0 | 516ee41c56889c29b8fa5f4ac0e824f4 | 53.33901 | -6.300217 | BROOKFIELD ROAD | 2016-10-17T12:47:15.625000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning |
| 36 | 4 | 6bff19daf1747a1e956a899ef8021232 | 53.34071 | -6.308191 | EMMET ROAD | 2016-10-17T12:47:15.766000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning |
| 34 | 1 | 11b359924d83d802a41856c405d13e61 | 53.33878 | -6.303950 | ROTHE ABBEY | 2016-10-17T12:47:15.744000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning |
We can define a capacity status in terms of how many bikes are being used in a given station. We define a station as overflow when there is few empty slots to park. As far as I have seen is other similar analysis, thresholds to score a station as overflow, balanced or shortage are subjective. In this analysis, stations with empty slots from 0% to 25% are scored as overflow. Stations with empty slots from 75% to 100% are scored as shortage and everything in between is balanced.
# Total slots ans shifts calculations
all_data$total_slots = all_data$empty_slots + all_data$free_bikes
all_data$empty_status = (all_data$empty_slots / all_data$total_slots) * 100
all_data$empty_status_label = cut(all_data$empty_status,c(0,25,75,100),c("overflow","balanced","shortage"))
all_data$empty_status_label[all_data$empty_status==0] <- "overflow"
A subset of all_data.
| empty_slots | free_bikes | id | latitude | longitude | name | timestamp | company_id | time_2 | week_day | hour | shift | total_slots | empty_status | empty_status_label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 20 | b1c774f178240d5d757854364dafc840 | 53.34143 | -6.246720 | FENIAN STREET | 2016-10-17T12:47:15.653000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning | 35 | 42.85714 | balanced |
| 3 | 26 | 135d1f3410c5dbcc49c55a1d522dbbe6 | 53.34664 | -6.246154 | CITY QUAY | 2016-10-17T12:47:15.673000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning | 29 | 10.34483 | overflow |
| 8 | 32 | 153ff4dfb7bd8912ef91c10849129c2e | 53.33521 | -6.250900 | FITZWILLIAM SQUARE EAST | 2016-10-17T12:47:15.755000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning | 40 | 20.00000 | overflow |
| 27 | 0 | 516ee41c56889c29b8fa5f4ac0e824f4 | 53.33901 | -6.300217 | BROOKFIELD ROAD | 2016-10-17T12:47:15.625000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning | 27 | 100.00000 | shortage |
| 36 | 4 | 6bff19daf1747a1e956a899ef8021232 | 53.34071 | -6.308191 | EMMET ROAD | 2016-10-17T12:47:15.766000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning | 40 | 90.00000 | shortage |
| 34 | 1 | 11b359924d83d802a41856c405d13e61 | 53.33878 | -6.303950 | ROTHE ABBEY | 2016-10-17T12:47:15.744000Z | dublin | 2016-10-17 12:47:15 | Mon | 12 | morning | 35 | 97.14286 | shortage |
Now it’s time to focus on a single city, in this case, London.
# One City
ci = "london" # line 38 in model bikes.R
ci_df = all_data %>%
filter(company_id == ci)
dim(ci_df)
## [1] 274291 15
ci_df = na.omit(ci_df)
dim(ci_df)
## [1] 271950 15
A subset of ci_df.
| empty_slots | free_bikes | id | latitude | longitude | name | timestamp | company_id | time_2 | week_day | hour | shift | total_slots | empty_status | empty_status_label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 11 | d79e95513c3b5e1d3c78a65874054481 | 51.49887 | -0.1374246 | 200178 - Buckingham Gate, Westminster | 2016-10-17T14:08:07.773000Z | london | 2016-10-17 14:08:07 | Mon | 14 | morning | 16 | 31.25000 | balanced |
| 20 | 0 | 090f297e79c030688a8a51755ebeafe3 | 51.50172 | -0.1798542 | 001231 - Queen’s Gate, Kensington Gardens | 2016-10-17T14:08:06.788000Z | london | 2016-10-17 14:08:06 | Mon | 14 | morning | 20 | 100.00000 | shortage |
| 9 | 9 | 7f3020118e56165ed8b2f61899edb971 | 51.52916 | -0.1099705 | 001023 - River Street , Clerkenwell | 2016-10-17T14:08:04.972000Z | london | 2016-10-17 14:08:04 | Mon | 14 | morning | 18 | 50.00000 | balanced |
| 13 | 22 | 67e6c16bce05410ba4b1f0f5000726ea | 51.49961 | -0.1975742 | 001018 - Phillimore Gardens, Kensington | 2016-10-17T14:08:04.980000Z | london | 2016-10-17 14:08:04 | Mon | 14 | morning | 35 | 37.14286 | balanced |
| 15 | 8 | d0ca154129878ce71261d563d8da9238 | 51.53006 | -0.1209737 | 001013 - St. Chad’s Street, King’s Cross | 2016-10-17T14:08:04.989000Z | london | 2016-10-17 14:08:04 | Mon | 14 | morning | 23 | 65.21739 | balanced |
| 7 | 19 | 8e74e96bcc8d74d7621f8201cee921dd | 51.49313 | -0.1568760 | 003420 - Sedding Street, Sloane Square | 2016-10-17T14:08:04.993000Z | london | 2016-10-17 14:08:04 | Mon | 14 | morning | 26 | 26.92308 | balanced |
Mean capacity by weekday, hour and station.
A subset of ci_df_ref.
| week_day | hour | name | avg_used_capacity |
|---|---|---|---|
| Sun | 8 | 000970 - Scala Street, Fitzrovia | 100 |
| Sun | 8 | 000999 - Queen Street 1, Bank | 100 |
| Sun | 8 | 001006 - Bayley Street , Bloomsbury | 100 |
| Sun | 8 | 001016 - Hereford Road, Bayswater | 100 |
| Sun | 8 | 001043 - Museum of London, Barbican | 100 |
| Sun | 8 | 001048 - Queen Victoria Street, St. Paul’s | 100 |
We can apply a rank technique to get X number of stations:
A subset of top_stations_weekday.
| week_day | hour | name | avg_used_capacity |
|---|---|---|---|
| Sun | 8 | 000970 - Scala Street, Fitzrovia | 100 |
| Sun | 8 | 000999 - Queen Street 1, Bank | 100 |
| Sun | 8 | 001006 - Bayley Street , Bloomsbury | 100 |
| Sun | 8 | 001016 - Hereford Road, Bayswater | 100 |
| Sun | 8 | 001043 - Museum of London, Barbican | 100 |
| Sun | 8 | 001048 - Queen Victoria Street, St. Paul’s | 100 |
A subset of bottom_stations_weekday.
| week_day | hour | name | avg_used_capacity | |
|---|---|---|---|---|
| 85211 | Fri | 4 | 200086 - Ilchester Gardens, Bayswater | 0 |
| 85212 | Fri | 4 | 200105 - Bonny Street, Camden Town | 0 |
| 85213 | Fri | 4 | 200118 - Parkway, Camden Town | 0 |
| 85214 | Fri | 4 | 200146 - Newby Place, Poplar | 0 |
| 85215 | Fri | 4 | 200182 - Rainville Road, Hammersmith | 0 |
| 85216 | Fri | 4 | 200211 - Union Grove, Wandsworth Road | 0 |
Although it’s simple to get that top and bottom used stations grouping just by station:
See top and bottom used stations:
## name avg_used_capacity
## 1 300079 - Manfred Road, East Putney 91.54527
## 2 200177 - Santos Road, Wandsworth 91.00370
## 3 300046 - Alma Road, Wandsworth 90.65369
## 4 200174 - Mostyn Grove, Bow 89.82960
## 5 300026 - All Saints' Road, Portobello 89.63444
## 6 200063 - Mexfield Road, East Putney 89.46055
## 7 200220 - Blythe Road West, Shepherd's Bush 88.22927
## 8 200186 - Nantes Close, Wandsworth 88.12582
## 9 200219 - Osiers Road, Wandsworth 88.06595
## 10 200098 - Aintree Street, Fulham 87.16813
## name avg_used_capacity
## 769 022183 - Crinan Street, King's Cross 26.13413
## 770 300234 - Stratford Station, Stratford 25.49022
## 771 001114 - Pont Street, Knightsbridge 25.47857
## 772 002695 - Cadogan Gardens, Chelsea 24.60526
## 773 200238 - Westferry DLR, Limehouse 24.47227
## 774 200050 - Naval Row, Blackwall 24.27386
## 775 200193 - Snowsfields, London Bridge 23.56908
## 776 300022 - Evesham Street, Avondale 21.96792
## 777 200118 - Parkway, Camden Town 21.89531
## 778 003420 - Sedding Street, Sloane Square 18.56592
If we want to have a better view, we can geocode those stations.
A subset of top_stations_geo.
| name | avg_used_capacity | latitude | longitude |
|---|---|---|---|
| 200063 - Mexfield Road, East Putney | 89.46055 | 51.45682 | -0.2028021 |
| 200098 - Aintree Street, Fulham | 87.16813 | 51.48102 | -0.2099735 |
| 200174 - Mostyn Grove, Bow | 89.82960 | 51.53054 | -0.0254920 |
| 200177 - Santos Road, Wandsworth | 91.00370 | 51.45706 | -0.2008063 |
| 200186 - Nantes Close, Wandsworth | 88.12582 | 51.46193 | -0.1807918 |
| 200219 - Osiers Road, Wandsworth | 88.06595 | 51.46067 | -0.1987354 |
A subset of bottom_stations_geo.
| name | avg_used_capacity | latitude | longitude |
|---|---|---|---|
| 001114 - Pont Street, Knightsbridge | 25.47857 | 51.49689 | -0.1612038 |
| 002695 - Cadogan Gardens, Chelsea | 24.60526 | 51.49246 | -0.1599190 |
| 003420 - Sedding Street, Sloane Square | 18.56592 | 51.49313 | -0.1568760 |
| 022183 - Crinan Street, King’s Cross | 26.13413 | 51.53447 | -0.1222030 |
| 200050 - Naval Row, Blackwall | 24.27386 | 51.50898 | -0.0069900 |
| 200118 - Parkway, Camden Town | 21.89531 | 51.53807 | -0.1446640 |
Now that we have geocoded top and bottom stations, we can plot them in a map.
## Source : https://maps.googleapis.com/maps/api/staticmap?center=London&zoom=12&size=640x640&scale=2&maptype=terrain
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=London
## Source : http://tile.stamen.com/terrain-lines/12/2045/1360.png
## Source : http://tile.stamen.com/terrain-lines/12/2046/1360.png
## Source : http://tile.stamen.com/terrain-lines/12/2047/1360.png
## Source : http://tile.stamen.com/terrain-lines/12/2045/1361.png
## Source : http://tile.stamen.com/terrain-lines/12/2046/1361.png
## Source : http://tile.stamen.com/terrain-lines/12/2047/1361.png
## Source : http://tile.stamen.com/terrain-lines/12/2045/1362.png
## Source : http://tile.stamen.com/terrain-lines/12/2046/1362.png
## Source : http://tile.stamen.com/terrain-lines/12/2047/1362.png
## Source : http://tile.stamen.com/terrain-lines/12/2045/1363.png
## Source : http://tile.stamen.com/terrain-lines/12/2046/1363.png
## Source : http://tile.stamen.com/terrain-lines/12/2047/1363.png
## Warning: Removed 2 rows containing missing values (geom_point).
We can plot the top and bottom stations by week day.
## Warning: Removed 7 rows containing missing values (geom_point).
## Warning: Removed 12 rows containing missing values (geom_point).
A subset of ci_df_3.
| name | avg_used_capacity |
|---|---|
| 300079 - Manfred Road, East Putney | 91.54527 |
| 200177 - Santos Road, Wandsworth | 91.00370 |
| 300046 - Alma Road, Wandsworth | 90.65369 |
| 200174 - Mostyn Grove, Bow | 89.82960 |
| 300026 - All Saints’ Road, Portobello | 89.63444 |
| 200063 - Mexfield Road, East Putney | 89.46055 |
A subset of ci_df_3_geo.
| name | avgcapacity | latitude | longitude |
|---|---|---|---|
| 000959 - Milroy Walk, South Bank | 33.42063 | 51.50724 | -0.1062375 |
| 000960 - Hop Exchange, The Borough | 56.37899 | 51.50463 | -0.0917738 |
| 000961 - Union Street, The Borough | 44.98475 | 51.50369 | -0.0984977 |
| 000962 - Stamford Street, South Bank | 52.78369 | 51.50557 | -0.1116067 |
| 000963 - Bankside Mix, Bankside | 44.83317 | 51.50582 | -0.1001863 |
| 000964 - Bath Street, St. Luke’s | 61.87755 | 51.52589 | -0.0908478 |
## Warning: Removed 27 rows containing missing values (geom_point).
Adding up the empty slots of stations can give us a good proxy of total bikes being used.
## hour bicis_circulando city
## 1 10 430339 london
## 2 8 420704 london
## 3 11 403440 london
## 4 12 381153 london
## 5 9 379578 london
## 6 13 379430 london
A subset of circulando_df.
| hour | bicis_circulando | city |
|---|---|---|
| 10 | 430339 | london |
| 8 | 420704 | london |
| 11 | 403440 | london |
| 12 | 381153 | london |
| 9 | 379578 | london |
| 13 | 379430 | london |
And then we can plot it.
We can also incorporate population data.
## 'data.frame': 8 obs. of 1 variable:
## $ unique(all_data$company_id): Factor w/ 8 levels "barcelona","budapest",..: 3 4 1 5 7 8 6 2
## city population
## 1 dublin 553165
## 2 helsinki 630752
## 3 barcelona 1604555
## 4 istanbul 14025646
## 5 milano 1347707
## 6 paris 2249975
A subset of ci.
| london |
A subset of circulando_df.
| city | population |
|---|---|
| dublin | 553165 |
| helsinki | 630752 |
| barcelona | 1604555 |
| istanbul | 14025646 |
| milano | 1347707 |
| paris | 2249975 |
And add a per_mil_capita metric.
A subset of cities_df_pop.
| city | hour | bicis_circulando | population | popu_mil | per_mil_capita |
|---|---|---|---|---|---|
| london | 10 | 430339 | 8673713 | 8673.713 | 49.61416 |
| london | 8 | 420704 | 8673713 | 8673.713 | 48.50333 |
| london | 11 | 403440 | 8673713 | 8673.713 | 46.51295 |
| london | 12 | 381153 | 8673713 | 8673.713 | 43.94346 |
| london | 9 | 379578 | 8673713 | 8673.713 | 43.76188 |
| london | 13 | 379430 | 8673713 | 8673.713 | 43.74482 |
Let’s plot bikes per mil capita vs average used capacity.
A subset of cities_df_ave.
| hour | city | bicis_circulando | population | popu_mil | per_mil_capita | ave_cap_h |
|---|---|---|---|---|---|---|
| 1 | london | 33592 | 8673713 | 8673.713 | 3.872851 | 55.21850 |
| 2 | london | 33583 | 8673713 | 8673.713 | 3.871814 | 55.20689 |
| 3 | london | 33618 | 8673713 | 8673.713 | 3.875849 | 55.23499 |
| 4 | london | 33670 | 8673713 | 8673.713 | 3.881844 | 55.30567 |
| 5 | london | 34156 | 8673713 | 8673.713 | 3.937875 | 56.09910 |
| 6 | london | 47064 | 8673713 | 8673.713 | 5.426050 | 57.85625 |
| 7 | london | 270192 | 8673713 | 8673.713 | 31.150673 | 57.91823 |
| 8 | london | 420704 | 8673713 | 8673.713 | 48.503334 | 57.56097 |
| 9 | london | 379578 | 8673713 | 8673.713 | 43.761881 | 55.37769 |
| 10 | london | 430339 | 8673713 | 8673.713 | 49.614162 | 55.05346 |
| 11 | london | 403440 | 8673713 | 8673.713 | 46.512952 | 55.18604 |
| 12 | london | 381153 | 8673713 | 8673.713 | 43.943465 | 54.91695 |
| 13 | london | 379430 | 8673713 | 8673.713 | 43.744818 | 54.84079 |
| 14 | london | 300851 | 8673713 | 8673.713 | 34.685376 | 54.72947 |
| 15 | london | 289957 | 8673713 | 8673.713 | 33.429398 | 54.99840 |
| 16 | london | 151349 | 8673713 | 8673.713 | 17.449159 | 57.33982 |
| 17 | london | 105286 | 8673713 | 8673.713 | 12.138516 | 57.58917 |
| 18 | london | 68430 | 8673713 | 8673.713 | 7.889355 | 56.21981 |
| 19 | london | 34129 | 8673713 | 8673.713 | 3.934762 | 56.00120 |
| 20 | london | 33992 | 8673713 | 8673.713 | 3.918968 | 55.76749 |
| 21 | london | 33805 | 8673713 | 8673.713 | 3.897408 | 55.60826 |
| 22 | london | 44921 | 8673713 | 8673.713 | 5.178982 | 55.50966 |
| 23 | london | 33686 | 8673713 | 8673.713 | 3.883689 | 55.37428 |
A subset of top_stations_geo.
| name | avg_used_capacity | latitude | longitude |
|---|---|---|---|
| 200063 - Mexfield Road, East Putney | 89.46055 | 51.45682 | -0.2028021 |
| 200098 - Aintree Street, Fulham | 87.16813 | 51.48102 | -0.2099735 |
| 200174 - Mostyn Grove, Bow | 89.82960 | 51.53054 | -0.0254920 |
| 200177 - Santos Road, Wandsworth | 91.00370 | 51.45706 | -0.2008063 |
| 200186 - Nantes Close, Wandsworth | 88.12582 | 51.46193 | -0.1807918 |
| 200219 - Osiers Road, Wandsworth | 88.06595 | 51.46067 | -0.1987354 |
| 200220 - Blythe Road West, Shepherd’s Bush | 88.22927 | 51.49968 | -0.2217916 |
| 300026 - All Saints’ Road, Portobello | 89.63444 | 51.51904 | -0.2047644 |
| 300046 - Alma Road, Wandsworth | 90.65369 | 51.45799 | -0.1848062 |
| 300079 - Manfred Road, East Putney | 91.54527 | 51.45816 | -0.2060025 |
A subset of bottom_stations_geo.
| name | avg_used_capacity | latitude | longitude |
|---|---|---|---|
| 001114 - Pont Street, Knightsbridge | 25.47857 | 51.49689 | -0.1612038 |
| 002695 - Cadogan Gardens, Chelsea | 24.60526 | 51.49246 | -0.1599190 |
| 003420 - Sedding Street, Sloane Square | 18.56592 | 51.49313 | -0.1568760 |
| 022183 - Crinan Street, King’s Cross | 26.13413 | 51.53447 | -0.1222030 |
| 200050 - Naval Row, Blackwall | 24.27386 | 51.50898 | -0.0069900 |
| 200118 - Parkway, Camden Town | 21.89531 | 51.53807 | -0.1446640 |
| 200193 - Snowsfields, London Bridge | 23.56908 | 51.50215 | -0.0836329 |
| 200238 - Westferry DLR, Limehouse | 24.47227 | 51.50930 | -0.0259960 |
| 300022 - Evesham Street, Avondale | 21.96792 | 51.51149 | -0.2181902 |
| 300234 - Stratford Station, Stratford | 25.49022 | 51.54179 | -0.0048100 |
A subset of ci_df.
| empty_slots | free_bikes | id | latitude | longitude | name | timestamp | company_id | time_2 | week_day | hour | shift | total_slots | empty_status | empty_status_label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 11 | d79e95513c3b5e1d3c78a65874054481 | 51.49887 | -0.1374246 | 200178 - Buckingham Gate, Westminster | 2016-10-17T14:08:07.773000Z | london | 2016-10-17 14:08:07 | Mon | 14 | morning | 16 | 31.25000 | balanced |
| 20 | 0 | 090f297e79c030688a8a51755ebeafe3 | 51.50172 | -0.1798542 | 001231 - Queen’s Gate, Kensington Gardens | 2016-10-17T14:08:06.788000Z | london | 2016-10-17 14:08:06 | Mon | 14 | morning | 20 | 100.00000 | shortage |
| 9 | 9 | 7f3020118e56165ed8b2f61899edb971 | 51.52916 | -0.1099705 | 001023 - River Street , Clerkenwell | 2016-10-17T14:08:04.972000Z | london | 2016-10-17 14:08:04 | Mon | 14 | morning | 18 | 50.00000 | balanced |
| 13 | 22 | 67e6c16bce05410ba4b1f0f5000726ea | 51.49961 | -0.1975742 | 001018 - Phillimore Gardens, Kensington | 2016-10-17T14:08:04.980000Z | london | 2016-10-17 14:08:04 | Mon | 14 | morning | 35 | 37.14286 | balanced |
| 15 | 8 | d0ca154129878ce71261d563d8da9238 | 51.53006 | -0.1209737 | 001013 - St. Chad’s Street, King’s Cross | 2016-10-17T14:08:04.989000Z | london | 2016-10-17 14:08:04 | Mon | 14 | morning | 23 | 65.21739 | balanced |
| 7 | 19 | 8e74e96bcc8d74d7621f8201cee921dd | 51.49313 | -0.1568760 | 003420 - Sedding Street, Sloane Square | 2016-10-17T14:08:04.993000Z | london | 2016-10-17 14:08:04 | Mon | 14 | morning | 26 | 26.92308 | balanced |
After a succint analysis, it’s time to create our prediction model. The goal is to predict a station status (overflow, balanced or shortage) in a given week day and hour.
## 'data.frame': 271950 obs. of 11 variables:
## $ empty_slots : int 5 20 9 13 15 7 6 16 13 3 ...
## $ free_bikes : int 11 0 9 22 8 19 12 0 5 21 ...
## $ latitude : num 51.5 51.5 51.5 51.5 51.5 ...
## $ longitude : num -0.137 -0.18 -0.11 -0.198 -0.121 ...
## $ hour : int 14 14 14 14 14 14 14 14 14 14 ...
## $ empty_status : int 31 100 50 37 65 26 33 100 72 12 ...
## $ empty_status_label: Factor w/ 3 levels "overflow","balanced",..: 2 3 2 2 2 2 2 3 2 1 ...
## $ id : chr "d79e95513c3b5e1d3c78a65874054481" "090f297e79c030688a8a51755ebeafe3" "7f3020118e56165ed8b2f61899edb971" "67e6c16bce05410ba4b1f0f5000726ea" ...
## $ name : chr "200178 - Buckingham Gate, Westminster" "001231 - Queen's Gate, Kensington Gardens" "001023 - River Street , Clerkenwell" "001018 - Phillimore Gardens, Kensington" ...
## $ shift : Factor w/ 3 levels "night","morning",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ week_day : Ord.factor w/ 7 levels "Sun"<"Mon"<"Tues"<..: 2 2 2 2 2 2 2 2 2 2 ...
## [1] 271950 11
Now I will normalize data so it can fit in a prediction model.
##
## overflow balanced shortage
## 57173 126060 88717
A subset of df1_norm.
| empty_slots | free_bikes | latitude | longitude | hour |
|---|---|---|---|---|
| 0.0793651 | 0.1746032 | 0.4662308 | 0.4236568 | 0.5909091 |
| 0.3174603 | 0.0000000 | 0.4963503 | 0.2427163 | 0.5909091 |
| 0.1428571 | 0.1428571 | 0.7864481 | 0.5407341 | 0.5909091 |
| 0.2063492 | 0.3492063 | 0.4740658 | 0.1671494 | 0.5909091 |
| 0.2380952 | 0.1269841 | 0.7959171 | 0.4938113 | 0.5909091 |
| 0.1111111 | 0.3015873 | 0.4056110 | 0.3407064 | 0.5909091 |
Basically the model will classify a station depending on the number of free slots, number of available bikes, latitude, longitude and hour of the day. Split the dataset in two sets, 70% for training and 30% for testing.
## [1] 190365 5
## [1] 81585 5
Let’s train and test the model. I will use knn with k = 21.
##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 81585
##
##
## | df1_test_pred_2
## df1_test_labels | overflow | balanced | shortage | Row Total |
## ----------------|-----------|-----------|-----------|-----------|
## overflow | 16475 | 573 | 0 | 17048 |
## | 47359.194 | 6830.008 | 5577.978 | |
## | 0.966 | 0.034 | 0.000 | 0.209 |
## | 0.974 | 0.015 | 0.000 | |
## | 0.202 | 0.007 | 0.000 | |
## ----------------|-----------|-----------|-----------|-----------|
## balanced | 444 | 36912 | 584 | 37940 |
## | 7005.008 | 20993.024 | 11273.158 | |
## | 0.012 | 0.973 | 0.015 | 0.465 |
## | 0.026 | 0.972 | 0.022 | |
## | 0.005 | 0.452 | 0.007 | |
## ----------------|-----------|-----------|-----------|-----------|
## shortage | 0 | 487 | 26110 | 26597 |
## | 5515.654 | 11424.166 | 34821.290 | |
## | 0.000 | 0.018 | 0.982 | 0.326 |
## | 0.000 | 0.013 | 0.978 | |
## | 0.000 | 0.006 | 0.320 | |
## ----------------|-----------|-----------|-----------|-----------|
## Column Total | 16919 | 37972 | 26694 | 81585 |
## | 0.207 | 0.465 | 0.327 | |
## ----------------|-----------|-----------|-----------|-----------|
##
##
A subset of df1_test.
| empty_slots | free_bikes | latitude | longitude | hour | empty_status | empty_status_label | id | name | shift | week_day | label_pred | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 11 | 51.49887 | -0.1374246 | 14 | 31 | balanced | d79e95513c3b5e1d3c78a65874054481 | 200178 - Buckingham Gate, Westminster | morning | Mon | balanced |
| 2 | 20 | 0 | 51.50172 | -0.1798542 | 14 | 100 | shortage | 090f297e79c030688a8a51755ebeafe3 | 001231 - Queen’s Gate, Kensington Gardens | morning | Mon | shortage |
| 4 | 13 | 22 | 51.49961 | -0.1975742 | 14 | 37 | balanced | 67e6c16bce05410ba4b1f0f5000726ea | 001018 - Phillimore Gardens, Kensington | morning | Mon | balanced |
| 10 | 3 | 21 | 51.52395 | -0.1225023 | 14 | 12 | overflow | 1eabd7ac8e781befd03f52ef56a18aa7 | 001022 - Brunswick Square, Bloomsbury | morning | Mon | overflow |
| 13 | 42 | 2 | 51.52994 | -0.1236168 | 14 | 95 | shortage | 025ecd9d4f9708f515a36660227c8f3f | 001011 - Belgrove Street , King’s Cross | morning | Mon | shortage |
| 16 | 21 | 9 | 51.52505 | -0.1311611 | 14 | 70 | balanced | ebe16a8a0722ff26b583e2bda9cd1af4 | 001009 - Taviton Street, Bloomsbury | morning | Mon | balanced |
As you can see in the previous table, the model predicts quite well.
##
## overflow balanced shortage
## overflow 16475 573 0
## balanced 444 36912 584
## shortage 0 487 26110
A subset of df1_test.
| empty_slots | free_bikes | latitude | longitude | hour | empty_status | empty_status_label | id | name | shift | week_day | label_pred | pred | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 11 | 51.49887 | -0.1374246 | 14 | 31 | balanced | d79e95513c3b5e1d3c78a65874054481 | 200178 - Buckingham Gate, Westminster | morning | Mon | balanced | 1 |
| 2 | 20 | 0 | 51.50172 | -0.1798542 | 14 | 100 | shortage | 090f297e79c030688a8a51755ebeafe3 | 001231 - Queen’s Gate, Kensington Gardens | morning | Mon | shortage | 1 |
| 4 | 13 | 22 | 51.49961 | -0.1975742 | 14 | 37 | balanced | 67e6c16bce05410ba4b1f0f5000726ea | 001018 - Phillimore Gardens, Kensington | morning | Mon | balanced | 1 |
| 10 | 3 | 21 | 51.52395 | -0.1225023 | 14 | 12 | overflow | 1eabd7ac8e781befd03f52ef56a18aa7 | 001022 - Brunswick Square, Bloomsbury | morning | Mon | overflow | 1 |
| 13 | 42 | 2 | 51.52994 | -0.1236168 | 14 | 95 | shortage | 025ecd9d4f9708f515a36660227c8f3f | 001011 - Belgrove Street , King’s Cross | morning | Mon | shortage | 1 |
| 16 | 21 | 9 | 51.52505 | -0.1311611 | 14 | 70 | balanced | ebe16a8a0722ff26b583e2bda9cd1af4 | 001009 - Taviton Street, Bloomsbury | morning | Mon | balanced | 1 |
## [1] 2088 13
A subset of df1_test_wrong.
| empty_slots | free_bikes | latitude | longitude | hour | empty_status | empty_status_label | id | name | shift | week_day | label_pred | pred | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 34 | 14 | 5 | 51.52334 | -0.1838464 | 14 | 73 | balanced | ee5b657420a4bbb92081e8a2b7a401fc | 001027 - Warwick Avenue Station, Maida Vale | morning | Mon | shortage | 0 |
| 278 | 10 | 3 | 51.52289 | -0.1716820 | 14 | 76 | shortage | 8b1f02423f3951fd6c337014465d7f7f | 010625 - Penfold Street, Marylebone | morning | Mon | balanced | 0 |
| 279 | 4 | 12 | 51.51689 | -0.1582499 | 14 | 25 | overflow | 23ee76ce7a5418e315a92a4f79ea12a2 | 010630 - George Place Mews, Marylebone | morning | Mon | balanced | 0 |
| 742 | 4 | 12 | 51.50501 | -0.1727296 | 14 | 25 | overflow | 5fb0ac0fdd9f54ba0b041d8ebba61c58 | 001217 - Serpentine Car Park, Hyde Park | morning | Mon | balanced | 0 |
| 792 | 6 | 19 | 51.52166 | -0.1090063 | 14 | 24 | overflow | fbd79d7e9e1996ed0e9955ed4f31b33e | 001008 - Hatton Wall, Holborn | morning | Mon | balanced | 0 |
| 852 | 9 | 24 | 51.51159 | -0.0771213 | 14 | 27 | balanced | 6be38ccc37cd0c70d4dabccf654dc86f | 000991 - Crosswall, Tower | morning | Mon | overflow | 0 |
Finally we can create a simple tool to search a given station and see its real and predicted status in a given day and hour.
## [1] 778
A subset of places.
| d79e95513c3b5e1d3c78a65874054481 |
| 090f297e79c030688a8a51755ebeafe3 |
| 67e6c16bce05410ba4b1f0f5000726ea |
| 1eabd7ac8e781befd03f52ef56a18aa7 |
| 025ecd9d4f9708f515a36660227c8f3f |
| ebe16a8a0722ff26b583e2bda9cd1af4 |
| 3704deafdc9a47f1a3b445511be1477b |
| 43a7f48d18b5950311a758a86869c23b |
| 0c01f449802222cdd504c8eafa0322a4 |
| ad43baa18c9dbea9ba4b0966c386cc0c |
A subset of places_names.
| 200178 - Buckingham Gate, Westminster |
| 001231 - Queen’s Gate, Kensington Gardens |
| 001018 - Phillimore Gardens, Kensington |
| 001022 - Brunswick Square, Bloomsbury |
| 001011 - Belgrove Street , King’s Cross |
| 001009 - Taviton Street, Bloomsbury |
| 003425 - Northington Street , Holborn |
| 000981 - British Museum, Bloomsbury |
| 001019 - Ampton Street , Clerkenwell |
| 001029 - Windsor Terrace, Hoxton |
##
## overflow balanced shortage
## overflow 0 0 0
## balanced 0 1 0
## shortage 0 0 0
A subset of selected_place.
| empty_slots | free_bikes | latitude | longitude | hour | empty_status | empty_status_label | id | name | shift | week_day | label_pred | pred |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | 12 | 51.51812 | -0.1442289 | 11 | 33 | balanced | 7c43ac2c8fefa2a6e3e0e6b57654e7cf | 003424 - Broadcasting House, Marylebone | morning | Mon | balanced | 1 |